Abstract

This document defines the high-level metadata necessary to describe the
physical parameter space of observed or simulated astronomical data sets,
such as 2D images, data cubes, X-ray event lists, IFU data, etc. The
Characterisation data model is an abstraction which can be used to derive a
structured description of any relevant data and thus to facilitate its
discovery and scientific interpretation. The model aims at facilitating the
manipulation of heterogeneous data in any VO framework or portal.

A VO Characterisation instance can include descriptions of the data axes,
the range of coordinates covered by the data, and details of the data sampling 
and resolution on each axis. These descriptions should be in terms of physical 
variables, independent of instrumental signatures as far as possible. 

Implementations of this model have been described in the IVOA Note
available at:

http://www.ivoa.net/Documents/latest/ImplementationCharacterisation.html

Utypes derived from this version of the UML model are listed and
commented on in the following IVOA Note:

http://www.ivoa.net/Documents/latest/UtypeListCharacterisationDM.html

An XML schema has been built from the UML model and is available at:

http://www.ivoa.net/xml/Characterisation/Characterisation-v1.11.xsd

1 Status of this document 

This document has been produced by the Data Model Working Group. It has 
been reviewed by IVOA Members and other interested parties, and has been 
endorsed by the IVOA Executive Committee as an IVOA Recommendation. 
It is a stable document and may be used as reference material or cited as 
a normative reference from another document. IVOA's role in making the 
Recommendation is to draw attention to the specification and to promote its 
widespread deployment. This enhances the functionality and interoperability 
inside the Astronomical Community.

2 Introduction



This document defines an abstract data model called "Data Set Characterisa- 
tion" (hereafter simply "Characterisation"). In this Introduction we present 
requirements and place the model in the broader context of VO data models. 
In Section 3 we introduce the concepts (illustrated with some examples) and 
discuss their interactions. In Section 4 we present a formal UML class model 
using the concepts defined earlier. XML and VOTABLE serializations are 
presented in Section 5 and the Appendices give further examples. 

2.1 The purpose of the Characterisation model 

Characterisation is intended to define and organize all the metadata necessary 
to describe how a dataset occupies multidimensional space, quantitatively 
and, where relevant, qualitatively. The model focuses on the axes used to
delineate this space, including but not limited to Spatial (2D), Spectral and
Temporal axes, as well as an axis for the Observable (e.g. flux, number of
photons, etc.), or any other physical axes. It should contain, but is not
limited to, all relevant metadata generally conveyed by FITS keywords.

Characterisation is applicable to observed or simulated data 1 but is not 
designed for catalogues such as lists of derived properties or sources (see 
Section 2.3). 

The model is intended to describe: 

• A single observation; 

• A data collection; 

• The parameter space used by a tool or package accessed via the VO. 

The model describes the available data, not its history. For instance, 
spatial resolution expresses the level of smearing of the true sky brightness 
distribution in a data set without differentiating between contributions from 
different atmospheric, instrumental and software processing effects (see Sec- 
tion 2.3). 

Characterisation has to satisfy two sets of requirements: 

I Data Discovery requirements:
This model prescribes elements for use in requests to databases and
services and thus forms a fundamental part of the standards for VO
requests. The use of this model should enable a user 2 to select
relevant observations from an archive efficiently. The selection will be
based purely on the geometry of the observations, that is, how and
how accurately the multidimensional space is covered and sampled.
Discovery may only require a simplified overview (e.g. position, waveband,
average spatial resolution). Data providers may opt for the inclusion
of data where there is insufficient information to respond to
certain parts of a query. Eventually, it should be possible for a client
to generate a detailed multidimensional footprint of an observation.
For example:

- What observations from a particular archive are likely to have
covered a specific VO Event? (Spatial and Temporal Coverage)

- Which CCD frames in a mosaic actually cover the position of a
particular galaxy? (detailed Spatial Coverage)

- What observed spectra have a resolution comparable with a given
simulated spectrum, e.g. matching the Shannon criterion? (Sampling
Precision)

1 Unless otherwise stated, we use the terms "dataset", "observations" etc. to mean any
applicable observed or simulated data.

2 A user is either a human or a software agent.
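Queries of the kind listed above can be answered from the Bounds alone. The following is a minimal sketch under stated assumptions: the names `AxisBounds`, `FrameSummary` and `frames_covering`, and all sample values, are hypothetical and not part of the model, and RA wrap-around at 0/360 degrees is deliberately ignored.

```python
from dataclasses import dataclass


@dataclass
class AxisBounds:
    """Lower and upper limit of the data on one axis (hypothetical helper)."""
    lo: float
    hi: float

    def contains(self, value: float) -> bool:
        return self.lo <= value <= self.hi


@dataclass
class FrameSummary:
    """Coarse, Bounds-only description of one CCD frame (hypothetical)."""
    name: str
    ra: AxisBounds    # degrees
    dec: AxisBounds   # degrees
    time: AxisBounds  # MJD


def frames_covering(frames, ra, dec, mjd):
    """Return the names of frames whose Bounds enclose the position and epoch."""
    return [
        f.name
        for f in frames
        if f.ra.contains(ra) and f.dec.contains(dec) and f.time.contains(mjd)
    ]


# Illustrative mosaic of three frames (made-up values).
frames = [
    FrameSummary("ccd_1", AxisBounds(149.5, 150.0), AxisBounds(2.0, 2.5),
                 AxisBounds(54000.0, 54000.5)),
    FrameSummary("ccd_2", AxisBounds(150.0, 150.5), AxisBounds(2.0, 2.5),
                 AxisBounds(54000.9, 54001.1)),
    FrameSummary("ccd_3", AxisBounds(150.0, 150.5), AxisBounds(2.5, 3.0),
                 AxisBounds(54000.9, 54001.1)),
]

matches = frames_covering(frames, ra=150.1, dec=2.2, mjd=54001.0)
```

Because Bounds only guarantee that all valid data are enclosed, a match at this level is a candidate; a client needing certainty would go on to test the Support.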

II Data Processing/Analysis requirements: 

Characterisation should detail the variation of sensitivity on all rele- 
vant axes (e.g. variation of sampling or sensitivity across the field of 
view, detailed bandpass function), in order to provide information to 
an analysis tool or for reprocessing. 
Errors may be provided for any or all axes. 

Version 1 will fulfill all Data Discovery requirements, and allow some 
simple automatic processing such as cross-correlation and data set compar- 
isons. Full implementation of Data Processing/Analysis requirements will 
only become available in a future version of this model. 

2.2 Scope of the document 

This document defines metadata items and organisation patterns for char- 
acterizing data products and their properties in the VO. It identifies some 
major contexts in which these patterns play a crucial role and it shows how 
metadata descriptions can be constructed in these contexts, in a form that 
is adjusted to the requirements for distribution and analysis of astronomical
observations. However, the precise application of these patterns in other
contexts may differ, and we therefore do not yet prescribe the precise
syntax of characterization metadata in all contexts. That is left
to the controlling documents.

2.3 Links to other IVOA modeling efforts

Characterisation arose out of the "Observation Data Model", a high-level
description of metadata associated with observed data, described in an IVOA
note available at http://www.ivoa.net/Documents/latest/DMObs.html. The
connection is summarised in Fig. 1. It became obvious that there was an urgent
need for a model to characterise the physical properties of data, alongside 
Provenance, DataCollection, Curation etc. (which provide instrumental, so- 
ciological and other information). For example, Provenance will be linked 
with Characterisation to provide the telescope location (needed for some co- 
ordinate transformations), calibration history, etc. Interactions are organised 
so that each Observation object of the Observation data model will have a 
Characterisation object encompassing the metadata relative to the physical 
axes along which the measurements are spanned. 

Characterisation complements and extends some of the metadata adopted
by the VO Registry (http://www.ivoa.net/Documents/latest/RM.html),
providing the finer level of detail needed to describe individual datasets.
Concurrently with the Characterisation DM, the Spectrum modeling effort
appeared (http://www.ivoa.net/Documents/latest/SpectrumDM.html), focusing
on spectral data sets. It partly re-uses the Characterisation metadata tree
representation for data spanned along the spectral axis. Data models for
Catalogues and Sources are also being developed. Ideally, all these models
must be mutually consistent and employ the definitions supplied by the
STC DM and/or by basic models describing astronomical quantities, as was
planned within the Quantity DM. However, some overlap and duplication may
occur, allowing data and service providers to use the parts they need
without excessive effort.

3 Exploring the Characterisation concepts 

3.1 Overview: a geometric approach 

We introduce the physical axes used to define the N-dimensional space oc- 
cupied by any data set or required for interpretation. When considering a 
typical astronomical observation, we have identified various Properties: 

• Coverage: describes what direction the telescope was pointing in, at
which wavelengths and when; and/or the region covered on each axis.
This is described in increasing levels of detail (see Section 3.6.1) by:

- Location

- Bounds

- Support

- Sensitivity

If the data contain many small regions then the Bounds may be qualified
by a

- Filling Factor

(especially if the Support is not precisely defined).

• Sampling Precision: describes the sampling intervals on each axis;

• Resolution: describes the effective physical resolution (e.g. PSF, LSF,
etc.).

Each property can be related to one or more physical axes, described in more
detail in Section 3.6. For each axis:

• Accuracy: describes the measurement precision, see Section 4.2.2.

Figure 1: Interaction between the Observation and Characterisation data models:
the Characterisation DM focuses on the physical information relative to an
Observation, a major class of the Observation Data Model. Characterisation DM has
been identified as an important building block within the Observation Data Model,
a large scope modeling effort to be completed in the future. Data management
aspects such as VO identifier, data format, etc. are handled in other metadata
descriptions like Resource Metadata or parts of the Observation model.

3.2 Examples of Characterisation



The tables below illustrate how the spatial, temporal and spectral domains 
and the observable quantity of some typical data sets can be described, at 
various levels of complexity, using the properties from Section 3.1. Table 1 
shows some of the Characterisation metadata for an X-ray event list.
Additional examples are presented in Appendix C: Table 2 for a 2-D image,
Table 3 for a 1D spectrum, Table 4 for an IFU Dataset, Table 5 for a radio
interferometry image service and Table 6 for simulated data.

In some of these examples, some concepts are interdependent, as discussed
further in Section 3.4.1. All these concepts can be applied to any data set but
some elements may not have defined values, or the origin may be arbitrary, for 
example the spatial location of a generic simulated galaxy cluster (Table 6). 



3.3 Structure and development strategy 

Characterisation provides a framework to present the metadata necessary to 
specify a dataset in a standard format and to make any interrelationships 
explicit. The description can be presented from the perspective of the Prop- 
erties or the Axes in a succession of progressively more detailed description 
layers. This will allow evolution of the model in three independent directions: 
new properties may be added as well as new axes, and if necessary new levels 
of description may be considered without breaking the overall structure. 

3.4 The Axis point of view 
3.4.1 Axes and their attributes 

The physical dimensions of the data are described by axes such as: Spatial, 
Spectral, Time, Velocity, Visibility, Polarisation, Observable. 
We recommend that data providers use these axis names but this is not
compulsory (e.g. FITS names can be used). The data provider will be
required to supply a UCD for each axis as well as the units. This would
help to disentangle ambiguous or imprecise metadata for data retrieval or
recognition by standard software. There is no limit on the number of axes
present and they may be dependent or overlapping (e.g. one frequency axis
and two velocity axes, representing the velocities of two separate molecules
with transitions at similar frequencies).
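The frequency/velocity example above can be written down as a small set of axis records. This is only a sketch: the class `AxisDescription`, its `depends_on` field and the axis names are hypothetical, and the UCD strings are illustrative values that should be checked against the UCD vocabulary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AxisDescription:
    """Name, UCD and unit of one axis (hypothetical record layout)."""
    name: str
    ucd: str
    unit: str
    depends_on: Optional[str] = None  # name of the axis this one is derived from


# One frequency axis and two velocity axes derived from it, one per molecule.
axes = [
    AxisDescription("Spectral", "em.freq", "GHz"),
    AxisDescription("Velocity_A", "spect.dopplerVeloc", "km/s", depends_on="Spectral"),
    AxisDescription("Velocity_B", "spect.dopplerVeloc", "km/s", depends_on="Spectral"),
]


def dependents(axis_list, name):
    """Return the names of all axes that depend on the axis called `name`."""
    return [a.name for a in axis_list if a.depends_on == name]


velocity_axes = dependents(axes, "Spectral")
```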

Some axes may not even be explicit in the data, but are implicit, present
only as a header keyword or elsewhere. For example, a simple 2D sky image
usually has celestial coordinate axes, but the time and spectral axes may not
be present in the main data array although the observation was made using
a finite integration time and wavelength band (a single sample on each of
the temporal and spectral axes). These implicit axes may be represented in
Coverage to provide their location and/or bounds, or even, for purposes such
as color corrections, their sensitivity as a function of the coordinate within
the bounds.

Properties \ Axes  | Spatial               | Temporal        | Spectral                      | Observable
-------------------+-----------------------+-----------------+-------------------------------+-----------------
[...]              | [...]                 | [...]           | [...]                         | [...]
Resolution         | PSF (x,y) or its FWHM | Time resolution | RMF (spectral redist. matrix) | SNR
Sampling Precision | [...] (x,y)           | Frame time      | PI bin                        | [...] quantization

[Rows above Resolution are incomplete in this copy; the surviving cell
fragments, in reading order, are: "[...] of energy", "Out-of-time events
(saturation); [...]", "[...] factor", "Good pixel [...]", "Live time [...]",
"not used".]

Table 1: Property versus Axis description of metadata describing an X-ray
CCD Event List. This also characterises the potential images and other
products which can be derived. During exposure, the instrument moves with
respect to the sky, so, for example, the sensitivity is a function of the support
on the first three axes.

3.4.2 Axes flags 

Axes flags (Section 4.2.1) are used to indicate Boolean and other qualifying 
properties. These include whether the axis represents a dependent variable 
(e.g. the Observable), the calibration status and whether the data are un- 
dersampled. 

3.5 Accuracy 

Accuracy characterises any uncertainties associated with each axis (Sec- 
tion 4.2.2) - astrometric uncertainties are attached to the Spatial axis, pho- 
tometric to the Observable etc. Note that this is a level of detail distinct 
from the assessment of the overall accuracy of data provided by the Registry 
metadata. 

3.6 The Property point of view 

The main properties needed for data description and retrieval are categorized 
under Coverage, Resolution, and SamplingPrecision, introduced in Section 
3.1. 

The values of the properties characterising an Observation may be derived 
from instrumental properties given in Provenance or from other
Characterisation features. For example, high energy missions move the
telescope during the observation (Table 1), leading to a time-variable
mapping from detector to celestial coordinates (the 'aspect solution').
This gives a spatially variable effective exposure time, which can be
derived from the temporal bounds multiplied by the filling factor, from the
sum of all the support intervals weighted by sensitivity, or from the
sampling precision and period within the bounds.
The sensitivity across the spectral band may be a function of spectral posi- 
tion (ARF). Such dependencies should be restricted to areas of significance to 
users, such as the Sensitivity class. At present, a single value, or the extrema, 
can be given for each element; more complex formulae will be available in a 
future version of Characterisation. 
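Two of the routes to an effective exposure time mentioned above can be sketched as follows. The function names and the numerical values are hypothetical; the sketch assumes a single temporal axis and one scalar sensitivity weight per support interval.

```python
import math


def exposure_from_filling_factor(t_start, t_stop, filling_factor):
    """Temporal Bounds multiplied by the Filling Factor."""
    return (t_stop - t_start) * filling_factor


def exposure_from_support(intervals, sensitivities):
    """Sum of the Support intervals, each weighted by its sensitivity."""
    return sum(
        (stop - start) * weight
        for (start, stop), weight in zip(intervals, sensitivities)
    )


# Illustrative observation: 1000 s between the bounds, 80% of it usable.
via_filling_factor = exposure_from_filling_factor(0.0, 1000.0, 0.8)

# The same observation described by three good-time intervals at full sensitivity.
good_time = [(0.0, 300.0), (400.0, 700.0), (800.0, 1000.0)]
via_support = exposure_from_support(good_time, [1.0, 1.0, 1.0])

# The two descriptions should agree if the metadata are self-consistent.
consistent = math.isclose(via_filling_factor, via_support)
```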

3.6.1 Coverage

Coverage has several levels of depth, providing a range of detail to meet the 
needs of any user/developer, illustrated in Fig. 2. The simplest approxi- 
mation to a spatial field of view presumes that a sharp-edged region of the 
celestial sphere has 100% sensitivity inside and 0% outside. In reality the 
transition is fuzzy and the region may be irregular and contain gaps. For 
example, some applications only need to know what range of coordinate axes 
values might contain data; others need to know the variation in (flux) sen- 
sitivity as a function of position on an axis. Coverage provides answers to 
these questions at different levels of precision, with the idea that software 
implementations will be able to convert between the levels. 



[Figure 2; panel labels: Location, Bounds]

Figure 2: Illustration of the different levels of description, left: for a
1-dimensional signal, right: for a 2D signal.

Coverage is described by four layers which give a hierarchical view of 
increasing detail: 

1. Location: The simplest Coverage element is the Location of a point
in N-dimensional parameter space, such as an image described by a
single value each of RA, Dec, wavelength and time. These are fiducial 
values representative of the data. A precise definition (mean, weighted 
median, etc.) is not required, but Location can serve as a reference 
value or origin of coordinates in frames with no absolute position (e.g. 
Table 6). 

2. Bounds: The next level of description is the SensitivityBounds, i.e. a 
single range in each parameter providing the lower and higher limits of 
an N-dimensional "box". The scalar intervals between the limits (the 
sizes and centres of each box-side) should also be available if required. 
The Bounds are guaranteed to enclose all valid data but there may be 
excluded edge regions for which there is no valid data, such as (on the 
wavelength axis) the 'red leak' end of a spectral filter. These provisions 
satisfy the intent of typical data discovery queries. 

3. Support: Mathematically, the support of a function is the subset of its 
domain where the function is non-zero. Here, Support describes quan- 
titatively the subsets of space, time, frequency and other domains, onto 
which the observable is mapped, where there are valid data (according 
to some specified quality criterion). Support may include one or many 
ranges on each axis (e.g. Table 4). 

4. Sensitivity: Sensitivity (unlike the previous 'on/off' properties)
provides numerical values indicating the variation of the response function
on each of the axes, such as the relative cell-to-cell sensitivity in the 
data. This includes filter transmission curves, flat fields, sensitivity 
maps, etc. The final limits on Sensitivity are determined by the bounds 
of the Observable; for example, the minimum and maximum given by 
a single count and by the saturation level for some types of detector. 
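The idea that software can convert between the Coverage levels can be illustrated on a single axis. This is a sketch under assumptions: Support is represented as a list of (low, high) intervals, Location is taken as the midpoint of the Bounds (the model does not require any particular definition), and the function names and wavelength values are hypothetical.

```python
def bounds_from_support(support):
    """Smallest single range that encloses every Support interval."""
    lows = [lo for lo, _ in support]
    highs = [hi for _, hi in support]
    return (min(lows), max(highs))


def location_from_bounds(bounds):
    """One possible fiducial value: the midpoint of the Bounds (an assumption)."""
    lo, hi = bounds
    return 0.5 * (lo + hi)


def in_support(support, value):
    """True if `value` lies inside at least one Support interval."""
    return any(lo <= value <= hi for lo, hi in support)


# Illustrative spectral Support with a gap, in Angstrom.
support = [(4000.0, 4500.0), (5000.0, 5600.0)]
bounds = bounds_from_support(support)
location = location_from_bounds(bounds)
location_has_data = in_support(support, location)
```

In this example the derived Location falls in the gap between the two Support intervals, which shows why Location is only a representative value and why a client needing certainty has to go down to Support.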

The Bounds may also contain the 

• Filling Factor sub-level, which gives the useful fraction of Bounds on 
any axis. It may not be appropriate to detail multiple small interrup- 
tions to data (for example detectors requiring dead time between each 
sample) if it is conventional for analysis systems to solve the problem 
using a statistical correction based on the Filling Factor. Very regular 
filling may also be described by Sampling (see below). Even if Support 
provides a complete description, the Filling Factor may be used to rank 
the suitability of data during discovery. 

A method should be provided to derive the Filling Factor from the Sam- 
pling Extent and Sample Precision (Section 3.6.2) if these are given, but if 
all three values are entered separately there needs to be a means of checking 
for consistency. 
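A possible form for such a derivation and consistency check is sketched below. It assumes regular one-dimensional sampling in which the Filling Factor is the ratio of the Sample Extent to the Sampling Period; that relation, the function names, the tolerance and the sample values are assumptions made for illustration.

```python
import math


def filling_factor(sample_extent, sampling_period):
    """Fraction of each sampling period actually covered by a sample."""
    if sampling_period <= 0:
        raise ValueError("sampling_period must be positive")
    # Overlapping samples cannot fill more than the whole period.
    return min(sample_extent / sampling_period, 1.0)


def is_consistent(declared, sample_extent, sampling_period, rel_tol=0.01):
    """Compare a separately entered Filling Factor with the derived one."""
    derived = filling_factor(sample_extent, sampling_period)
    return math.isclose(declared, derived, rel_tol=rel_tol)


# Illustrative detector: 3.0 s integrations repeated every 3.2 s.
derived = filling_factor(3.0, 3.2)
ok = is_consistent(0.94, 3.0, 3.2)     # agrees within 1%
bad = is_consistent(0.80, 3.0, 3.2)    # contradicts the sampling values
```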

3.6.2 Resolution and Sampling Precision

Resolution is often a smoothly decaying (e.g. Gaussian) function but the
data product is subject to further discrete Sampling, e.g. CCD pixels
(Table 2). Resolution may, however, be a top hat function determined by the
Sampling interval - e.g. the temporal resolution of an image made from a
single integration. We maintain a distinction between the concepts to facili- 
tate different requirements in data processing, whether during data discovery 
services which allow resampling or flexible resolution (Table 5), or during 
post-discovery processing (Table 4). 

• Resolution: Resolution is usually the minimum independent interval of
measurement on any axis. Mathematically, if the physical attributes
(e.g. position, time, energy) of the incident photons, or other observable,
are $\mathbf{x}$ (e.g. $x_0$ = energy, $x_1$ = RA, $x_2$ = Dec, $x_3$ = time, etc.),
and the measured attributes are $\mathbf{y}$ (e.g. $y_1$ = spectral channel,
$y_2, y_3$ = pixel position, $y_4$ = time bin), then given a flux of photons
$S(\mathbf{x})$ the detected number of photons is

$$N(y_1, y_2, \ldots) = N(\mathbf{y}) = \int S(\mathbf{x})\, A(\mathbf{x})\, R(\mathbf{x}, \mathbf{y})\, d\mathbf{x}$$

where $A$ is the probability that a photon is detected at all (the quantum
efficiency) and $R(x_1, x_2, y_1, y_2, \ldots)$ is the smearing of measured values
(PSF, line spread function, etc.).

In the most detailed case, $R(\mathbf{x}, \mathbf{y})$ may be a complicated function, such
as a PSF which varies as a function of detector position and energy. The
first level of simplification is to specify a single function which applies
to the whole observation - e.g. a single PSF. This function may either
be provided as a parameterized predefined function (e.g. Gaussian) or
as an array. The concept of Resolution Bounds provides the extreme
values of resolution (see Table 5).

The final level of simplification is to give a single number characterising
the resolution, such as the standard deviation of a Gaussian PSF.

• Sampling: Sampling (or pixelization or precision or quantization)
describes the truncation of data values as part of the data acquisition or
data processing. If sampling is non-linear, simplification may be necessary,
by giving limiting values or a single 'characteristic sampling precision'.
The Sampling Period gives the sample separation and the Sample Extent
shows the deviation from the pure "Dirac comb" case. The Nyquist
parameter - the ratio between the resolution FWHM and the Sampling
period - will also be provided by a method. The Sampling flags
(Section 4.2.1) provide a simple guide as to whether these properties are
significant.
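The integral for the detected number of photons given in the Resolution item can be evaluated numerically in the simplest case of a single Gaussian smearing function on one axis. The following sketch uses a plain Riemann sum; the grid, the source shape, the constant efficiency and all function names are illustrative assumptions rather than anything prescribed by the model.

```python
import math


def gaussian(value, centre, sigma):
    """Unit-area Gaussian evaluated at `value`."""
    z = (value - centre) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def detected_counts(source, efficiency, response, xs, ys):
    """Approximate N(y) = integral of S(x) A(x) R(x, y) dx by a Riemann sum."""
    dx = xs[1] - xs[0]
    return [
        sum(source(x) * efficiency(x) * response(x, y) for x in xs) * dx
        for y in ys
    ]


STEP = 0.05
grid = [i * STEP for i in range(201)]   # 0 to 10 in arbitrary axis units


def source(x):
    """S(x): 100 photons in a narrow line centred on x = 5."""
    return 100.0 * gaussian(x, 5.0, 0.1)


def efficiency(x):
    """A(x): constant 50% detection probability."""
    return 0.5


def response(x, y):
    """R(x, y): a single Gaussian smearing function with sigma = 0.3."""
    return gaussian(y, x, 0.3)


counts = detected_counts(source, efficiency, response, grid, grid)
total = sum(counts) * STEP              # integrated detected photons
peak_index = counts.index(max(counts))  # grid index of the smeared line peak
```

Because R is normalised over y, the integrated count depends only on S and A; the smearing changes the shape of N(y), not the total.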

3.7 Presentation of layered information 

The layered structure allows tasks to retrieve only the metadata which is 
actually required. The lower levels can be very detailed, for example the 
variation in Sensitivity to the Observable(s) along the spatial, spectral and 
other axes, or the variation of the resolution within the field of view. This 
could take various forms: 

• A simple value or range 

• An analytic function of other property values 

• A variance map for 2D data 

• A look-up table for the bandpass correction to 1D spectral data

The more complex properties may be provided using pointers to ancillary 
data with the same types of axes and dimensions as the observation itself, e.g. 
a weight map packaged with a 2D image; this capability exists in the first 
version of this model. The provision of "attribute formulae" or attributes 
pointing to functional descriptions, such as the aspect solution for an X-ray 
observation, is left for the future development of Characterisation; a first 
step may be to decompose a complex coupled description into non-coupled 
expressions. Where it is possible to provide separate values for interdepen- 
dent elements (see also the end of Section 3.6.1), there must be a validation 
method to avoid contradictions. 

A later version of the model will also allow links to other aspects of the 
Observation model, external calibration and documentation. Advanced VO 
tools could use such metadata to recalibrate data on demand.
Characterisation is used to describe potential as well as static data
products (e.g. Tables 1
and 5). It could therefore also provide pointers to Registry entries indexing 
tools and services that could be launched on the fly for extracting images 
etc. from event or visibility data or atlas cut-outs. 

4 The Model 

4.1 The role and structure of the Model 

We use UML diagrams to describe the organisation of Characterisation
metadata following the Properties/Axis/Levels perspective. The model offers
different views of the characterisation concepts. Figure 3 shows the
relationships between the main concepts. The AxisType box attached to each
property class represents the axes along which the property (e.g. Resolution)
is assessed; for example, there can be one Resolution class for each relevant
axis. Fig. 4 illustrates how the properties of the data are gathered under the
Characterisation container class. The Coverage class is shown with the four
increasingly detailed properties introduced in Section 3.6; such a
Characterisation tree is available for each type of axis.



[Figure 3; legible diagram labels: container, has a (for each axis),
property 0..1, Characterisation, AxisType, Error]

Figure 3: This UML class diagram emphasises the Property/Axis perspective.
The Characterisation class is a container that gathers the properties for each
axis. The axis is described by the CharacterisationAxis class. All relevant
axes for one observation/dataset are linked to the Characterisation class. The
AxisType template parameter for each Property allows properties to be linked
to the corresponding Axis. The Accuracy class, linked to the
CharacterisationAxis class, gathers different types of Error descriptions
(systematic, statistical) as well as quality flags.



4.2 Axis description 

All the information related to an axis is gathered within the Characteri- 
sationAxis class. This can have common "factorised" attributes applicable 
to the property layers on that axis (Section 3.1). It contains the name of 
the axis, units, UCD as well as a holder for the STC coordinate frame (see 
Section 4.4) which also provides the base class for the observatory location 
(Observation - Provenance model). 

If a deep level (higher number, Section 3.6) object, e.g. Sensitivity, needs
to have its own axis description, this can be defined locally, overriding the
factorised top level CharacterisationAxis object. The redefinition can be
partial, e.g. a change of unit or a change of spatial orientation requiring a
new CoordSystem element.

Figure 4: The layered structure of Characterisation: This diagram
synthesises the Property/Axis/Layer approach. The concepts are represented in
yellow. The coarse description is designed by the first level (blue boxes),
while the pale blue ones represent the complementary metadata. The Bounds,
Support and Sensitivity classes are nested levels of detail to add knowledge
about the Coverage of an Observation. Symmetrically, Resolution and Sampling
may also have the 4-level structure of description. The complete
Characterisation for one observation is obtained by filling the tree for each
relevant axis: spatial, spectral, temporal, etc.

4.2.1 Flags and other qualifying information 

Other elements in the CharacterisationAxis class include the number of bins
present on this axis, and flags to indicate the calibration status, independence
and sampling properties of the axis, as described in Section 3.4.2.

4.2.1.1 Independent or dependent status Axes may include both 'inde- 
pendent' variables (which may have associated errors) and the "Observable" 
axis or axes which represent phenomena measured along some other axes. 
For instance, in a 3D datacube of the sky, the Spatial axis is an independent 
axis (flag TRUE), as is the (implicit) Spectral axis, but the Flux axis is de- 
pendent (flag FALSE), and the velocity axis is dependent on the frequency 
axis. 

4.2.1.2 Calibration status The CharacterisationAxis object in the Char- 
acterisation model provides a calibration status flag for each axis, so that a 
user can insist on calibrated data only where necessary. The CalibrationSta- 
tus is given separately for each type of characterisation axis and can be 

• UNCALIBRATED: not in units which can be directly compared with 
other data (but often still useful, for example the presence of spectral 
lines at known wavelengths can give a redshift regardless of absolute 
flux densities). 

• CALIBRATED: in reliable physical units or other accepted units such 
as magnitudes. 3 

• RELATIVE: calibrated to within a constant (additive or multiplicative) 
factor which is not precisely known, such as arising from uncertainty 
in the flux density of a reference source. 

• NORMALIZED: dimensionless data, divided by another data set (or a 
local extremum). 

The calibration process itself is described elsewhere in the Observation Data 
Model (Section 2.3). 
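The per-axis calibration flag lends itself to a simple enumeration, which lets a client insist on calibrated data only for the axes it cares about. In this sketch the four member names come from the list above, while the function `meets_requirement`, the dictionary layout and the sample dataset are hypothetical.

```python
from enum import Enum


class CalibrationStatus(Enum):
    """The four calibration states listed for a characterisation axis."""
    UNCALIBRATED = "UNCALIBRATED"
    CALIBRATED = "CALIBRATED"
    RELATIVE = "RELATIVE"
    NORMALIZED = "NORMALIZED"


def meets_requirement(axis_status, required_axes):
    """True if every axis named in `required_axes` is flagged CALIBRATED."""
    return all(
        axis_status.get(axis) is CalibrationStatus.CALIBRATED
        for axis in required_axes
    )


# Illustrative spectrum: wavelength scale calibrated, flux only relative.
spectrum = {
    "Spectral": CalibrationStatus.CALIBRATED,
    "Observable": CalibrationStatus.RELATIVE,
}

usable_for_redshift = meets_requirement(spectrum, ["Spectral"])
usable_for_photometry = meets_requirement(spectrum, ["Spectral", "Observable"])
```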

3 In such cases the coarser levels of description should also be given in physical units and 
the need for a tool such as a look-up table of zeropoints etc. and conversion algorithms 
has been identified.

4.2.1.3 Sampling status

• Undersampling: TRUE if the sampling precision period is coarse
compared to the resolution and the precision of a single data value is limited
by the sampling; FALSE if the sampling precision period is small
compared to the resolution and precision is limited by the resolution.

• Regular sampling: TRUE if the pixellation or binning is close to linear 
with respect to the axis world coordinate (so that an accurate position 
can be obtained by counting samples from a Bound); FALSE if this 
would introduce an error significant with respect to other uncertainties. 

• The total number of samples along each axis may be given, normally 
used for multiple regular sampling. 
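The two sampling flags could be derived from simpler metadata along the following lines. This is a sketch, not the model's method: the Nyquist parameter is taken as the ratio of the resolution FWHM to the Sampling Period (Section 3.6.2), the threshold of 2 samples per FWHM for undersampling is an assumption, and the function names, tolerance and sample values are hypothetical.

```python
import math


def nyquist_parameter(resolution_fwhm, sampling_period):
    """Ratio between the resolution FWHM and the sampling period."""
    return resolution_fwhm / sampling_period


def is_undersampled(resolution_fwhm, sampling_period, threshold=2.0):
    """Flag data with fewer than `threshold` samples per FWHM (assumed limit)."""
    return nyquist_parameter(resolution_fwhm, sampling_period) < threshold


def is_regular(coords, tolerance):
    """True if counting samples from the first one predicts every coordinate
    to within `tolerance` (i.e. the sampling is close to linear)."""
    if len(coords) < 2:
        return True
    step = (coords[-1] - coords[0]) / (len(coords) - 1)
    return all(
        abs(c - (coords[0] + i * step)) <= tolerance
        for i, c in enumerate(coords)
    )


# Illustrative image: 1.2 arcsec PSF sampled with 0.2 arcsec pixels.
well_sampled = is_undersampled(1.2, 0.2)
# Illustrative image: 0.1 arcsec PSF sampled with 0.2 arcsec pixels.
coarse = is_undersampled(0.1, 0.2)

linear_axis = is_regular([4000.0, 4010.0, 4020.0, 4030.0], tolerance=0.5)
log_axis = is_regular([4000.0, 4400.0, 4840.0, 5324.0], tolerance=0.5)
```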

4.2.2 Errors in Characterisation: the Accuracy class 

The values along Coordinate axes and measurements of Observables may all 
suffer from systematic and statistical uncertainties. Errors may be in the 
units of the axis or may be represented by quality flags. These Error classes 
are gathered in an Accuracy object (linked to the CharacterisationAxis
object, see Fig. 5, and STC data model elements, see Section 4.4). Accuracy
supports multiple levels of description, analogous to Coverage. The
uncertainty in the position or measurement on any axis can be described by a
typical value, by the bounds on a range of errors, and/or by very detailed 
error values for each sampling element (e.g. pixel). 4 A pointer may be 
provided to error maps packaged with the data, as described for the more 
detailed levels of Coverage. 

4.3 Navigation in the model: by axis or by properties? 

The structure of Characterisation is clearly hierarchical with the
Characterisation class as the root element. The model can be serialised using two
alternative sets of primary elements: 

• Properties, with the corresponding classes for each axis attached; used, 
for example, to represent data where the axes values are interdependent 
(e.g. Table 1); 

• Axes, factorising each description into the multi-layer property levels; 
this provides more compact XML. 

4 Measurement errors are distinct from any 'fuzziness' in the values provided by the 
coarsest levels of Characterisation, e.g. Location may be an arbitrary approximation 
(Section 3.6.1), but that kind of uncertainty is catered for by going to deeper levels of 
Characterisation, and by the concept of Region of Regard in the Registry Resource model.

Figure 5: This class diagram illustrates the CharacterisationAxis class and
its relationship with the Accuracy class, which encompasses various types of
errors such as systematic or statistical.

Either structure could be applied to the examples tabulated in Section 3.2.
This UML model could be used to build two different XML schemas, giving 
access primarily by property or by axis. Here, we present the "Axis First" 
serialisation only; the "Property First" serialisation will be presented in the 
next version of this model. 
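
The relation between the two orderings can be illustrated with a minimal sketch. It assumes that a description is held as a nested Python dictionary; the function name `pivot`, the keys and the numerical values are hypothetical and are not taken from the Characterisation schema.

```python
from typing import Any, Dict

Tree = Dict[str, Dict[str, Any]]


def pivot(tree: Tree) -> Tree:
    """Swap the two outer levels of a nested mapping."""
    swapped: Tree = {}
    for outer_key, inner in tree.items():
        for inner_key, value in inner.items():
            swapped.setdefault(inner_key, {})[outer_key] = value
    return swapped


# Hypothetical "Axis First" description; the values are illustrative only.
axis_first: Tree = {
    "spatial": {"resolution": 1.2, "samplingPrecision": 0.4},
    "spectral": {"resolution": 6.0},
}

# The same content, reached by property first and then by axis.
property_first = pivot(axis_first)
```

Applying `pivot` twice gives back the original ordering, which is one way to see that both serialisations carry the same information.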

4.4 Implementing the model using STC elements 

STC, the metadata scheme for Space-Time Coordinates (see 
http://www.ivoa.net/Documents/latest/STC.html), encompasses the description 
of most of the Characterisation axes examples in Section 3.4.1 with the ex- 
ception of Observable. Sensitivity is the only Property not present in STC. 
However, the full STC structure cannot simply be reused, as it does not 
have the flexibility needed to deliver the alternative schemata for both multi- 
layered views presented in Sections 4.1 and 4.3. We do use STC intermediate 
level objects as building blocks for the Characterisation model. 

The STC:AstroCoordSystem object is needed as a reference to define the 
Coverage axes. STC substructures may be reused in the following way: 

• Location implements STC:AstroCoords 

• Bounds encapsulates STC basic types, some STC:Interval elements and 
STC:Coords into a structure similar to STC:AstroCoordArea. 

• Support uses STC:AstroCoordArea 

• Resolution.ResolutionRefVal can be implemented via ad hoc types using 
STC:CResolution elements 

• SamplingPeriod and SampleExtent encapsulate CPixSize elements from 
STC. 

This is represented for the spatial axis using implementation links in the 
UML diagram in Fig. 6. 

In simple cases data handlers will probably reuse predefined elements 
included from an external STC library. For example, CharacterisationAxis 
includes the STC elements for CoordSys and the (possibly variable) space- 
time coordinates of the ObservatoryLocation 5 or of the origin of coordinates 
(e.g. for barycenter-corrected data). 

Many parameters (i.e. most numerical-valued elements at a finer level 
than Location) are customarily expressed either as maximum and minimum 
values or as a centre and scalar range (or both). In some cases an array 
of such values is needed, e.g. 2 dimensions on the spatial axis in most but 

5 This should, where necessary, be consistent with the Provenance section of the Obser- 
vation model (Section 2.3). 
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Figure 6: Expressing the spatial properties as a subtree of Characterisation 
. Here is an example of how STC components (in pink italics) may be used 
to implement the different levels of the Coverage description. The Location 
element uses an STC:AstroCoords. Bounds encapsulates STC basic 
structures like STC:Interval elements and STC:Coords in a structure similar to 
STC:AstroCoordArea. 
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not all cases; upper and lower bounds to (separately) the major and minor 
axes of Resolution in Table 5; higher dimensionality is possible such as the 
inclusion of beam position angle in this Resolution example. 

The Resolution and Pixel-Size concepts are represented in STC at a deep 
level inside the Coordinates class (together with the Name/Value/Error in 
the Coordinate object). This allows any coordinate to be expressed to the 
appropriate degree of numerical precision. Characterisation needs to allow 
selection of metadata by resolution, which must therefore be accessible at 
the upper level of description. Resolution is thus coded as a Property along 
one CharacterisationAxis, as is SamplingPrecision. 

Since the space, time and spectral axes are particularly important for 
astronomy, we recommend that implementations include a method to return 
an STC::AstroCoordSys object, which will only succeed if a complete and 
consistent space-time-spectral description is present. This may be nominal 
or arbitrary for some axes e.g. for simulated data. 
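
A minimal sketch of such a method is given here under stated assumptions: the classes `CharacterisationAxis` and `Characterisation`, the axis labels and the dictionary returned in place of a real STC object are all hypothetical stand-ins, not part of the STC or Characterisation schemas.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Axis labels assumed for this sketch only.
REQUIRED_AXES = ("spatial", "temporal", "spectral")


@dataclass
class CharacterisationAxis:
    name: str                     # e.g. "spatial"
    coord_system: Optional[str]   # None when the axis has no coordinate system
    unit: Optional[str] = None


@dataclass
class Characterisation:
    axes: List[CharacterisationAxis] = field(default_factory=list)

    def astro_coord_sys(self) -> Dict[str, str]:
        """Return a stand-in for an STC AstroCoordSys object.

        Raises ValueError unless the space, time and spectral axes are
        all present and each one declares a coordinate system.
        """
        by_name = {axis.name: axis for axis in self.axes}
        missing = [name for name in REQUIRED_AXES
                   if name not in by_name or not by_name[name].coord_system]
        if missing:
            raise ValueError("incomplete description: " + ", ".join(missing))
        return {name: by_name[name].coord_system for name in REQUIRED_AXES}
```

Raising an error rather than returning a partial object mirrors the statement that the method only succeeds for a complete and consistent description; a nominal coordinate system for simulated data would still count as present.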

5 XML Serialization 

5.1 XML schema (Axis First) 
5.1.1 Design of the schema 

Due to the hierarchical nature of the Model, the XML serialization of 
Characterisation is based here on a single tree. The appropriate elements are taken 
from STC as described in Section 4.4. The root element, called 
"Characterisation", is the aggregation of a set of CharacterisationAxis elements 6, one for each 
of the axes. The CharacterisationAxis element contains all axis information, 
such as an obvious label ("spatial", "temporal"), coordinate system, units, etc. 
Coverage implements different elements according to the four levels of 
description detailed in Section 3.6.1. Lower levels of these properties along one 
particular CharacterisationAxis may reuse the axis parameters defined in 
the top-level objects for that axis or redefine their own axis parameters (units, 
coordsystem, ...) locally, as described in Section 4.2. 

A full XML serialisation is provided, as an XML schema, for simple 
observations, at the following site: 

http://www.ivoa.net/xml/Characterisation/v1.11 

An XML instance document, MPFS-v1.11.xml, describing an IFU dataset 
characterisation is available at 

http://www.ivoa.net/internal/IVOA/CharacterisationDataModel/. 

6 These elements are containers gathering the result of the dynamic grouping of prop- 
erties for a given characterization axis 
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Most of the usual (spatial, time, spectral, velocity) coordinates information re-uses 
the STC coordinates definitions and structures. For simple values, the present version 
of the schema makes use of STC fine grain structures like CPixsize or CError. Such 
terminal leaves elements in the Characterisation XML tree could be adapted and re-use 
simple existing structures (value, unit, semantic tag, coding format) when available in a 
standard. 

5.1.2 Building blocks of the schemata 

In order to illustrate how the XML schema is derived from the UML Model, building 
blocks of the schema, corresponding to some main classes of the UML diagram, are 
shown here. 

The principle is to map the main classes to XML elements, building up a hierarchy 
from the most encompassing concept down to more specific ones. Aggregated classes are easily 
translated as aggregated subelements. The attributes of a UML class are also coded as 
sublevel elements. 

The translation from UML to XML used in this serialisation applies rules and 
elaborates specific techniques very similar to the work of Carlson (Modeling XML applications 
with UML, Addison-Wesley, 2001). The examples shown here are 'handmade' translations 
of the UML model. Automated translation will be discussed in the next version of 
Characterisation. The derivation of the XML from the UML model is expressed in the graphical 
views of the XML schema in Figs. 7, 8, 10, 11 and 12. 

Figure 7: The CharacterisationAxis element is built up following the 
corresponding UML class, with coordsystem and ObsyLoc items reusing STC elements. 
The small arrow on cha:numbins represents a substitution group head element in 
XML. This makes it possible to plug in various constructs of this element (e.g. for 1D, 2D, 3D) 
that play the same role in the XML tree. 

Figure 8: The coordsystem and unit items can be factorised at the top of the 
Coverage structure, but may be redefined at each level when necessary. Bounds 
are expressed using a limits element which is developed on a general bounding 
box type: CharCoordArea. AreaType is a string describing the kind of region 
used: Circle, Polygon etc. 

Figure 9: Representing limits: The two expressions allowed for a bounding box 
are expressed using either an STC:CoordInterval embedded in a locally defined type 
cha:Interval or built on another type: CharBox, representing a generic centered box 
in N dimensions. 

Figure 10: This graphical view was generated with XMLSPY from the resolution 
element of the schema. As designed in the UML class, the resolution item may 
contain 4 possible subelements. The RefVal element should be present but is not 
mandatory: some observations may have unknown resolution. 

Figure 11: The samplingPrecision item contains 4 possible subelements. 
One among SamplingPrecisionRefVal and SamplingPrecisionBounds should 
be present when possible but this is not explicitly described by the XML syntax. 

Figure 12: The accuracy element relies on Errors along the axes and is built 
up on STC elements. 
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5.2 Utypes generation: select one ordering strategy 



One application of such a model is to provide a naming convention for every metadatum 
considered within the model, in order to be able to identify one concept in various models 
or serialisations. The idea is that by navigating in the model following the logical links 
provided, it is possible to construct identifiers called Utypes that could be understood by 
any VO tool aware of the model. To avoid multiplicity, the Utypes are built from the XML 
schema representation of the model, which already enforces a hierarchical structure. For 
instance, the size of the sampling element along the spatial axis in a 2D image corresponds to: 
Characterisation.spatialAxis.samplingPrecision.samplingPrecisionRefVal.sampleExtent 
The full list of Utypes derived from this model (versions 1.1x of the model) is stored at 
http://www.ivoa.net/Documents/latest/UtypeListCharacterisationDM.html 
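
The path-based construction can be sketched as follows. This is only an illustration: it assumes the element hierarchy is available as a nested dictionary, and both the function `iter_utypes` and the small `model` fragment are hypothetical. The authoritative list of Utypes remains the IVOA Note cited above.

```python
from typing import Any, Dict, Iterator


def iter_utypes(node: Dict[str, Any], prefix: str) -> Iterator[str]:
    """Yield one dot-separated path for each leaf of a nested dictionary."""
    for name, child in node.items():
        path = f"{prefix}.{name}"
        if isinstance(child, dict) and child:
            # Descend into sub-elements, extending the path.
            yield from iter_utypes(child, path)
        else:
            yield path


# Hypothetical fragment of the element hierarchy, for illustration only.
model = {
    "spatialAxis": {
        "samplingPrecision": {
            "samplingPrecisionRefVal": {
                "samplingPeriod": None,
                "sampleExtent": None,
            }
        }
    }
}

utypes = list(iter_utypes(model, "Characterisation"))
```

Because each leaf is reached by exactly one path from the root, a tree-shaped schema yields exactly one identifier per element, which is the property the text relies on to avoid multiplicity.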

5.2.1 VOTABLE serialisation 

A VOTABLE serialisation of the characterisation of the IFU MPFS data set is shown in 
Appendix B. Each CharacterisationAxis is shown as a table, where each property itself 
is shown as a Group of FIELDS. UML class attributes are serialised as FIELDS (except 
if they have a detailed STC structure; in that case they are translated as a group of 
FIELDS). In this example, Utypes are set for each Table, Group, and Field according to 
the following rule: 

A Utype is elaborated for each VOTable item in the serialisation as a 
string based on instance variable paths in our object-oriented datamodel. 

Other ways of deriving utypes from a valid Xpath to the equivalent XML element in 
the XML Characterisation schema have been studied. The main difference is that this 
option may use constrained element (or attribute) values in the Utype path. The IVOA 
needs to define a single and robust rule to define this concept. 



A Appendix A: XML serialisation example 

This is an XML instance document representing the characterisation of an IFU data set taken 
with the Russian MPFS instrument. It relies on the XML schema mentioned above. See 
the corresponding XML document at: 

http://www.ivoa.net/internal/IVOA/CharacterisationDataModel/MPFS-v1.11.xml 

B Appendix B: VOTable serialisation example 

This is an alternative serialisation, using the VOTable format and applying the Utype 
mechanism to map the various items to the Characterisation Data Model classes and attributes. 
Utypes are derived from the Characterisation XML schema as mentioned above. See the 
full XML document at: 

http://www.ivoa.net/internal/IVOA/CharacterisationDataModel/MPFSVOt-v1.11.xml 
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C Appendix C: Characterisation of various dataset 
properties 
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Axes 

Properties         | Spatial                                          | Temporal               | Spectral                     | Observable (e.g. Flux)
-------------------+--------------------------------------------------+------------------------+------------------------------+----------------------------------
Coverage           |                                                  |                        |                              |
Location           | Central position                                 | Mid-time               | Central wavelength           | Average flux
Bounds             | RA, Dec [min,max] or Bounding box [center, size] | Start/stop time        | Wavelength [min, max]        | Saturation, Limiting flux
Support            | FOV as array of polygons                         | Time intervals (array) | Wavelength intervals (array) |
Sensitivity        | Quantum efficiency (x,y)                         |                        | Transmission curve (λ)       | Function property e.g. linearity
Filling factor     | Effective/Total area                             | Live time fraction     |                              |
Resolution         | PSF (x,y) or its FWHM                            | Duration per image     | Band FWHM                    | Flux SNR (stat error)
Sampling Precision | Pixel scale (x,y)                                | Duration per image     | Band FWHM                    | (1 ADU equivalent = Quantization)

Table 2: Property versus Axis description of metadata describing a 2D 
optical image. This represents a single integration or indivisible stack of 
exposures, taken in a single broad-band filter, so the spectral resolution is the 
same as the filter FWHM. 



Axes 

Properties         | Spatial                                 | Temporal                  | Spectral               | Observable (e.g. Flux)
-------------------+-----------------------------------------+---------------------------+------------------------+----------------------------------
Coverage           |                                         |                           |                        |
Location           | Central position                        | Mid-Time                  | Central wavelength     | Average flux
Bounds             | Slit RA, Dec [min, max] or Bounding box | Start/stop time           | Wavelength [min, max]  | Saturation, Limiting flux
Support            | Slit as accurate array of polygons      | Time (intervals)          | Wavelength intervals   | Lowest and highest value
Sensitivity        | Response (x,y) along slit               |                           | Quantum efficiency (λ) | Function property e.g. Linearity
Filling factor     | Effective/Total area                    | Live time fraction        |                        |
Resolution         | Slit area                               | Min. extractable interval | LSF or its FWHM        | Flux SNR (stat error)
Sampling Precision | Slit area                               | Min. extractable interval | Pixel scale in Å       | (1 ADU equivalent = Quantization)

Table 3: Property versus Axis description of metadata describing a 1D 
spectrum. 
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Axes 

Properties         | Spatial                              | Temporal                  | Spectral                            | Observable (e.g. Flux)
-------------------+--------------------------------------+---------------------------+-------------------------------------+----------------------------------
Coverage           |                                      |                           |                                     |
Location           | Central position                     | Mid-Time                  | Central wavelength (all spectra)    | Average flux
Bounds             | [...] RA, Dec [min, max]             | Start/stop time           | Wavelength [min, max] (all spectra) | Saturation, Limiting flux
Support            | Union of fiber footprints on the sky | Time intervals (array)    | Disjoint wavelength intervals       | Lowest and highest value
Sensitivity        | Response (x,y) along the slit        |                           | Quantum efficiency (λ)              | Function property e.g. Linearity
Filling factor     | Effective/Total area                 | Live time fraction        |                                     |
Resolution         | PSF (x,y) or its FWHM                | Min. extractable interval | LSF or its FWHM                     | Flux SNR (stat error)
Sampling Precision | Pixel scale (x,y)                    | Min. extractable interval | Pixel scale in Å                    | (1 ADU equivalent = Quantization)



Table 4: Property versus Axis description of metadata describing 3D IFU 
data. These are taken using a mask of multiple slits or fibres each focusing 
a separate spectrum onto a single detector array. The Support comprises 
multiple discrete intervals in all dimensions, into which data products could 
be decomposed. The spatial resolution is determined by the telescope aperture 
(and the seeing) which spreads the incident radiation over several CCD pixels; 
the resolution and pixel scales impose different constraints on downstream 
data analysis. 
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Axes 

Properties         | Spatial                                                | Temporal                | Spectral                        | Observable (e.g. Flux)
-------------------+--------------------------------------------------------+-------------------------+---------------------------------+-----------------------
Coverage           |                                                        |                         |                                 |
Location           | Central position                                       | Mid-Time                | Central Frequency               | Average flux
Bounds             | RA, Dec [min,max] or Bounding box [center, size]       | Start/stop time         | Frequency [min, max]            | Saturation, rms noise
Support            | Primary beam FWHM (or mosaic polygons)                 | Time intervals (array)  | Frequencies (array)             | Peak, 3σ rms
Sensitivity        | Smearing limits/functions (of integ. time/chan. width) | Gain-elevation          | Bandpass function(s) or FWHM(s) | Dynamic range
Filling factor     | Fraction of mosaic filled                              | Live time fraction      | Fraction above FWHM sensitivity |
Resolution         | Spatial scales (max and min of BMaj, BMin, BPA)        | Min. imageable duration | FWHM of Hanning smoothing       | RMS noise
Sampling Precision | Pixel scales [min, max]                                | Integration time        | Channel width                   |





Table 5: Property versus Axis description of metadata describing a radio 
image service, potentially mosaiced. The Max. and Min. spatial resolu- 
tions arise from the shortest and longest baselines present; any intermediate 
value may be selected when an image is extracted from visibility data. The 
spectral resolution may be coarsened by smoothing to minimise artefacts. 
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Axes 

Properties         | Spatial                     | Temporal                 | Spectral            | Observable (e.g. Flux)
-------------------+-----------------------------+--------------------------+---------------------+-----------------------
Coverage           |                             |                          |                     |
Location           | Central position (0, 0)     | Mid-Time (0)             | Central Frequency   | Average flux
Bounds             | Bounding box [center, size] | Relative start/stop time | Frequency [min,max] | Saturation, rms noise
Support            | FOV as array of polygons    | Time interval            | Frequencies         |
Sensitivity        | Quantum efficiency (x, y)   |                          | Transmission curve  | Detector linearity
Filling factor     | Effective/Total area        | (100%)                   |                     |
Resolution         | PSF FWHM                    | Duration                 | Band FWHM           | Noise error
Sampling Precision | Pixel scales [x, y]         | Duration                 | Band FWHM           | Quantization



Table 6: Property versus Axis description of metadata describing a simu- 
lated CCD observation in a single band. The spatial coordinates may be 
expressed in (x, y) independent of celestial position. 
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D Appendix D: Requirements for Data Model 
compliance 

D.1 Limitations in this version 

The first three levels of Characterisation are now fully described and take explicit values. 
The fourth level of the structure can contain functions (e.g. the variation of noise with 
position) or URLs (e.g. the location of a weight map). Data providers may have varying 
expectations about how these advanced metadata should be delivered, so we will expand 
the description of this level in a future version of the model, after polling the community 
for the use of weight maps, variability maps, etc... We anticipate that the first three levels 
will answer more than 70% of present needs. 

It is not yet possible to implement rules linking coverage on different axes. For ex- 
ample, if a survey consists of spatially distinct fields, observed in several wavebands, but 
there are fields which do not contain all wavebands, then each field and/or each waveband 
must be described separately. Similarly, separate descriptions are required if resolution or 
noise (for instance) behave differently in various areas of Support. 

D.2 Implementing Characterisation 
D.2.1 Data Providers 

Several tools are being developed to help data providers supply metadata. These include 
extraction of information from FITS headers and a form interface called CAMEA, which 
allows the user to enter values for Characterisation elements and translates them to XML. 
We will also provide XML templates for manual editing. We will investigate what would 
be more convenient for large data collections, depending on how they store their existing 
metadata. 

Metadata required by Characterisation might be extracted from a number of sources 
such as: 

• An archive database; 

• An observing log or other description which might be stored in a database or as 
ASCII, XML or other documents; 

• FITS headers, which provide more or less direct routes: 

— Unambiguous identification between e.g. a database column or FITS keyword 
and a Characterisation element; 

— Correspondence with formulaic modification, e.g. adding explicit units or 
calculating the field of view of an interferometer; 

— A separate information source e.g. resolution of the telescope using different 
frequencies / configurations; 

— Offline/human memory/judgement 
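
As an illustration of the "formulaic modification" route, the sketch below derives Location, Bounds and a sampling period for one linear axis from standard FITS keywords (NAXISn, CRVALn, CRPIXn, CDELTn, CUNITn). It is a minimal example under stated assumptions: the header is a plain dictionary, the axis is strictly linear (no sky projection or rotation), and the function names and output keys are hypothetical rather than schema element names.

```python
from typing import Dict, Tuple, Union

Header = Dict[str, Union[int, float, str]]


def linear_axis_bounds(header: Header, axis: int) -> Tuple[float, float]:
    """Bounds of a linear FITS axis, taken at the outer edges of its pixels."""
    n_pix = header[f"NAXIS{axis}"]
    crval = header[f"CRVAL{axis}"]
    crpix = header[f"CRPIX{axis}"]
    cdelt = header[f"CDELT{axis}"]
    # FITS pixel centres run from 1 to NAXIS, so the edges are 0.5 and NAXIS + 0.5.
    first_edge = crval + (0.5 - crpix) * cdelt
    last_edge = crval + (n_pix + 0.5 - crpix) * cdelt
    return (min(first_edge, last_edge), max(first_edge, last_edge))


def describe_axis(header: Header, axis: int, default_unit: str) -> dict:
    """Collect a few Characterisation-like values for one axis."""
    low, high = linear_axis_bounds(header, axis)
    return {
        # Add explicit units when the header does not carry them.
        "unit": header.get(f"CUNIT{axis}", default_unit),
        "location": 0.5 * (low + high),
        "bounds": (low, high),
        "samplingPeriod": abs(header[f"CDELT{axis}"]),
    }
```

For example, a header with NAXIS1 = 100, CRVAL1 = 5000.0, CRPIX1 = 1.0 and CDELT1 = 2.0 gives bounds of 4999.0 to 5199.0 in whatever unit applies to the axis.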

The following sections outline our proposals for which of the status strings 
MANDATORY, RECOMMENDED or OPTIONAL should be applied to each element of the XML 
schema. The status strings are used as in the SIAP proposed recommendation 
(http://www.ivoa.net/Documents/WD/SIA/sia-20040524.html), interpreted as follows: 

• MANDATORY means that the metadatum is fully required to make the data usable 
in basic VO services 
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• RECOMMENDED means that this item should be given if at all possible to improve 
the interpretation of the data or their use in a wider range of VO services 

• OPTIONAL means that such metadata elements would help to give more precise 
interpretation but are not vital. 

An implementation may have one of the following levels of compliance: 

• "partially compliant", if it implements some (but not all) MANDATORY/MUST 
elements 

• "compliant", if it implements all MANDATORY/MUST elements 

• "fully compliant", if it implements all MANDATORY/MUST elements and all 
RECOMMENDED/SHOULD ones 
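
The three levels above can be expressed as a short decision function. This is a sketch only: the function name and the element names used to call it are hypothetical, and the label returned when no MANDATORY element is implemented is an assumption, since the text does not name that case.

```python
from typing import Iterable, Set


def compliance_level(implemented: Iterable[str],
                     mandatory: Set[str],
                     recommended: Set[str]) -> str:
    """Classify an implementation by the elements it provides."""
    have = set(implemented)
    mandatory_done = have & mandatory
    if mandatory_done != mandatory:
        # "not compliant" is an assumed label: the text does not name
        # the case where no MANDATORY element is implemented.
        return "partially compliant" if mandatory_done else "not compliant"
    if recommended <= have:
        return "fully compliant"
    return "compliant"
```

A call such as `compliance_level({"axis", "unit"}, {"axis", "unit"}, {"support"})` would return "compliant", because every MANDATORY element is present but one RECOMMENDED element is not.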

The prime goal is to get this model applied by data providers in useful ways. We should 
make it as easy as possible to describe any kind of observed or simulated data by minimising 
the number of compulsory fields. At the same time we must encourage data providers to 
give enough information to expand the ways in which data can be selected or manipulated 
by VO tools currently or imminently available. 

D.3 Requirements for compliance 
D.3.1 General considerations 

On each axis, the first three levels (Location, Bounds, Support), must be given explicit 
numerical values (or arrays of values) in order to be accessible to any tool. Other elements 
may be given numerical values, or functions, or indirect references (URIs) but these are 
in general not used at present. 

Users are strongly encouraged to evaluate coarser levels of description explicitly even 
if they also provide finer levels. We need to decide what users do if they are not giving a 
value for an element e.g. leave blank, consistent with other models. 

Location, Bounds and other higher Characterisation levels are intentionally approxi- 
mations to provide a simple inclusive description of the data. 

The Location value may be determined with some error, which might be mentioned 
inside the STC structure used for Coordinates. This is to be distinguished from Bounds 
which should be the outer limits to anywhere data might be found. 

Accuracy properties describe uncertainties in the mapping process of data values along 
axes, see Section D.7. 

The values for some elements must be given as arrays, defined as in STC, and the 
required number of arguments must be present if any are. Bounds, for instance, describes 
a unique region on an axis as e.g. (α1, δ1; α2, δ2), whilst the ResolutionSupport is given 
relatively e.g. telescope beam major and minor axes in arcsec and position angle. 

D.3.2 Defaults 

Defaults might sometimes be possible for values which have not been provided. We do not 
think that such defaults should be coded into the description; rather, software which 
looks for the value of a missing element might be able to make an intelligent assumption. It 
is up to the writers of a software tool specification to decide whether it is more dangerous 
to use defaults and risk a lower level of accuracy, or to ignore data which is not adequately 
specified and thus lose potentially important information. 
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For example, if Location is not given then, for some axes, software may take the 
default Location as the mid-point of the Bounds 7 . 

If Bounds are not given then e.g. if a spatial axis has Coordsys ICRF some software 
might assume all-sky coverage 8 . 

If Support is not given Bounds, if present, could be used. 

If the unit or Coordsys element is not given for any level, the values for the 
CharacterisationAxis are used; be careful, as this may be unsuitable (e.g. if the 
CharacterisationAxis units are sexagesimal, then a single number for an error could be in degrees 
or arcsec or ...). 
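
The fallbacks described above could be applied by client software along the following lines. This is a sketch under stated assumptions: it treats an axis as a simple scalar interval (footnote 7 notes that the mid-point may be complicated or impossible for some axes), and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]


def default_location(location: Optional[float],
                     bounds: Optional[Interval]) -> Optional[float]:
    """Use the mid-point of Bounds when Location is missing."""
    if location is not None:
        return location
    if bounds is None:
        return None
    return 0.5 * (bounds[0] + bounds[1])


def default_support(support: Optional[List[Interval]],
                    bounds: Optional[Interval]) -> Optional[List[Interval]]:
    """Use Bounds as a single Support interval when Support is missing."""
    if support is not None:
        return support
    return [bounds] if bounds is not None else None


def effective_unit(level_unit: Optional[str], axis_unit: str) -> str:
    """Fall back to the CharacterisationAxis unit when a level has none."""
    return level_unit if level_unit is not None else axis_unit
```

Keeping these rules in the reading software, rather than writing the derived values back into the description, follows the position taken in this section that defaults should not be coded into the metadata.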

D.4 Axes 

It is MANDATORY to provide at least one axis (coded as a CharacterisationAxis ele- 
ment). All three of the Space, Time and Spectral Coverage axes are RECOMMENDED 9 . 

The unit and coordinate system are MANDATORY for each Axis present. These may 
be relative to an internal reference only, e.g. pixel spatial coordinates. In such a case 
both the Location and Bounds are MANDATORY for that axis. Note that STC allows 
'RELOCATABLE', for example as a valid Location for simulated data, unless this is 
incompatible with the specified coordinate system. 

Space-, Spectral- and Time-related axes, and most other potential axes, are already 
defined in STC; where this is the case, it is MANDATORY to use the STC coordinate 
system and unit definitions. Various cases of how to re-use STC elements are shown in 
the example XML documents provided. The Observable Axis is RECOMMENDED 10 . 

Axes which are not yet defined in STC (such as Polarization at the present time) are 
OPTIONAL but a reference to the definition of the proper Coordinate System should be 
given. 

D.4.1 Axis Flags 

For each CharacterisationAxis element (spatial, spectral, observable, etc.): 

A flag to indicate if it is an independent or a dependent variable ('true' or 'false') is 
RECOMMENDED. 

A flag to indicate its calibration status is RECOMMENDED: 
CALIBRATED, UNCALIBRATED, RELATIVE, NORMALIZED; default UNCALIBRATED. 

Flags to indicate SamplingStatus are OPTIONAL (these are RECOMMENDED where 
they are customarily relevant): 

• undersamplingStatus ('true' or 'false') 

• regularsamplingStatus ('true' or 'false') 



7 this might be complicated (e.g. some spatial coordinates) or impossible 

8 a more restricted coverage might be derived once there is a link to Observation and 
the telescope location 

9 some might be considered irrelevant for simulated data, or not conventionally provided 
e.g. for old spectra with no time stamp 

10 its omission may seem reasonable for e.g. the coverage intended for a future survey 
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D.5 Coverage 



For each CharacterisationAxis, it is MANDATORY to give either the Location or the 
Bounds elements. Both Location and Bounds are RECOMMENDED if these are avail- 
able. 

Support is RECOMMENDED; if it is given then it is MANDATORY also to give the 
Bounds 11 . 

Sensitivity 12 (e.g. the URI of a weight map, or a function) is OPTIONAL. 
The Unit and/or CoordSystem is OPTIONAL for each of these coverage layers; if not 
given, they will default to the units and CoordSystem used for the Characterisation- 
Axis element (i.e. when the axis was first defined). 
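
The two MANDATORY rules of this section can be checked mechanically. The sketch below assumes that the Coverage of one axis is held as a dictionary with the hypothetical keys "location", "bounds" and "support"; it is not a substitute for validation against the XML schema.

```python
from typing import Any, Dict, List


def coverage_problems(coverage: Dict[str, Any]) -> List[str]:
    """List violations of the MANDATORY Coverage rules for one axis."""

    def given(key: str) -> bool:
        return coverage.get(key) is not None

    problems: List[str] = []
    if not (given("location") or given("bounds")):
        problems.append("either Location or Bounds is MANDATORY")
    if given("support") and not given("bounds"):
        problems.append("Bounds is MANDATORY when Support is given")
    return problems
```

An empty list means that neither rule is violated; RECOMMENDED and OPTIONAL elements are deliberately not checked here.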

D.6 Other Properties: Resolution and Sampling Precision 

Resolution and SamplingPrecision relate to a specific Coverage along one 
CharacterisationAxis. They are organised according to progressive levels of description as in 
Coverage but themselves contain the relevant layers, e.g., for some axis: 

• at level 1: resolutionRefVal instead of Location stands for a typical or average 
value for the resolution as in Spectral.Resolution.resolutionRefVal 

• at level 2: Bounds contains the lowest and highest values present as in 
Spatial.Resolution.resolutionBounds 

• at level 3: Support represents sets of discrete ranges of sampling intervals as in 
Spectral.SamplingPrecision.Support 

• at level 4: resolutionVariability stores the variability of resolution with position 
on the axis as in Spatial.Resolution.Variability 

If there are many areas of Support within the Coverage, the Accuracy, 
Resolution and SamplingPrecision should refer to the inside of each Support area. However, 
in this version of the model, it is assumed that, in principle, on any one axis, one 
description of each of these properties applies to all Support areas, otherwise each area must be 
described in a separate Characterisation tree description (see Section D.1). 

The Resolution and/or SamplingPrecision are OPTIONAL; if they are present, it 
is MANDATORY to give the unit and Coordsys on axes where the units of the 
CharacterisationAxis would not make sense or are ambiguous; otherwise the CharacterisationAxis 
values are used. The unit and Coordsys are OPTIONAL for any individual level of Accuracy, 
Resolution or Sampling; if they are omitted, the value defined at the start of the Accuracy, 
Resolution or Sampling definitions is used. 

11 If different areas of Support apply on different axes, a separate description should be 
used at the level where each subset of data can be described unambiguously, see Section 
D.1 

12 Here, Sensitivity is the dependence of a detector response or equivalent with position 
on the given axis. This is not the limiting sensitivity in the sense of the faintest detectable 
flux, which is given by the lower Bound of the Observable axis. 
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D.6.1 Resolution 



If Resolution is present, then it is MANDATORY to give the ResolutionRefVal (i.e. 
Location). ResolutionBounds are RECOMMENDED. The ResolutionSupport and 
ResolutionVariability (as a function of position on that axis) are OPTIONAL. 

D.6.2 Sampling Precision 

If SamplingPrecision is present, it is MANDATORY to give a 
samplingPrecisionRefVal (i.e. Location) which contains both samplingPeriod and sampleExtent. It 
is MANDATORY to provide the samplingPeriod, whilst an explicit sampleExtent is 
RECOMMENDED but not required. 

SamplingPrecisionBounds, SamplingPeriodLimits and sampleExtentLimits 
are also RECOMMENDED. 

The SamplingPrecision.Support and related values for the samplingPeriod and/or 
the sampleExtent are OPTIONAL. The SamplingPrecision.Variability (i.e. 
Sensitivity) (in the form of a samplingPrecisionMap to describe variations along an axis) is 
OPTIONAL. 

The FillFactor is RECOMMENDED for any axis where the actual coverage in each 
Support region is significantly less than 1 but the filling is too complex to be described 
practically using Sampling 13 . 

The FillFactor of the SamplingPrecision is OPTIONAL; if it is present and if 
SamplingPeriod and SampleExtent are also given, then logically: 

FillFactor = SampleExtent / SamplingPeriod 

and the data provider should take care that the values and units given are consistent with 
this relationship. 
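
A data provider could check this relationship before publishing metadata, for instance as sketched below. The function names and the tolerance are hypothetical choices, and both quantities are assumed to be expressed in the same unit.

```python
import math


def fill_factor(sample_extent: float, sampling_period: float) -> float:
    """FillFactor = SampleExtent / SamplingPeriod (both in the same unit)."""
    if sampling_period <= 0:
        raise ValueError("sampling period must be positive")
    return sample_extent / sampling_period


def is_consistent(declared_fill: float, sample_extent: float,
                  sampling_period: float, rel_tol: float = 1e-6) -> bool:
    """Compare a declared FillFactor with the value implied by the sampling."""
    implied = fill_factor(sample_extent, sampling_period)
    return math.isclose(declared_fill, implied, rel_tol=rel_tol)
```

For example, a sample extent of 1.0 with a sampling period of 2.0 implies a FillFactor of 0.5, so a declared value of 0.9 would be flagged as inconsistent.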

D.7 Accuracy 

Accuracy values for the precision of measurements are RECOMMENDED for each 
CharacterisationAxis, divided into statistical and systematic uncertainties (or appropriate 
alternative definitions of uncertainties). For each CharacterisationAxis where Accuracy is 
provided: 

• It is MANDATORY to give the unit and Coordsys on axes where the units of 
the CharacterisationAxis would not make sense or are ambiguous, otherwise the 
CharacterisationAxis values are used. 

• The unit and Coordsys are OPTIONAL for any axis 14 . 

• It is MANDATORY to give the ErrorRefVal (typical value). 

• The ErrorBounds are OPTIONAL for uncertainties which vary along the domain 
of the axis. 

• The URI of an ErrorMap which describes the variation of errors with location is 
OPTIONAL. 

13 FillFactor applies to the usable fraction of data within each Support presently 
defined. If we find that the majority of users want it to be the useful fraction of the whole 
Bounds, the name and definition will be changed in a future version. 

14 for example normalised units such as a flux accuracy of 0.03 given flux measurement. 
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Appendix E: Updates of this document 



version 0.9 

— full re-organisation, rewriting and simplification of the Model description, 
Anita Richards April/May 2006 

— new XML schema, F.Bonnarel 

version 0.93 

— update XML schema and use STC block elements 

— add examples of characterisation metadata for various data sets and various 
dimensions 

version 0.96 to 0.99 

— finalise the XML schema and interface it with the STC XML schema 

— update the document format according to the IVOA master document. 

version 1.0 

— revision by the authors. Fill factor, Support discussions: minor formulation 
changes 

— add Appendix D: Requirements for Data Model compliance prepared by Anita 
Richards and discussed at the Moscow interoperability meeting 

version 1.1 

— modify data model for more compatibility to Spectrum data model : remove 
AxisFrame class not present in Spectrum DM. 

— modify schema and figures accordingly 

version 1.11 

— make the document compatible with the IVOA referencing mechanism: update links. 

— list up changes of the various draft versions in the present document 

version 1.12 

Changes following the RFC and TCG comments and requirements 

— Schema reference: update links to XML schema 
http://www.ivoa.net/xml/Characterisation-v1.11 

— update ref to examples MPFS: 

http://alinda.u-strasbg.fr/Model/Characterisation/examples/MPFS-v1.11.xml 

— change Fig. 1 about interactions between Characterisation and other models. 
The Observation class is made more obvious, as is the inclusion of Char inside the 
Observation DM. 

— update reference to the Quantity DM: simplify 

— introduce Spectrum DM reference in section 2.3 'Links to other efforts' 

— include a Scope section at the beginning of the document 

— Appendix E: include the document revision history directly in this file instead 
of pointing to an external file 

— use smaller fonts for figure captions 

March 2008, 25: corrected the links to the UtypeList document 
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